Bounds for the Spectral Norm of Functions of Matrices*

By

Jean Descloux**

Introduction 

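Let $A$ be a square matrix of order $n$ with complex elements $a_{ij}$ and distinct eigenvalues $\lambda_1, \dots, \lambda_m$ of multiplicity $n_1, \dots, n_m$ ($\sum_{i=1}^{m} n_i = n$).

$\psi(\lambda) = (\lambda - \lambda_1)^{\nu_1} (\lambda - \lambda_2)^{\nu_2} \cdots (\lambda - \lambda_m)^{\nu_m}$

denotes the minimal function of $A$ ($\nu_i \le n_i$, $i = 1, \dots, m$).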

Let $p(z)$ and $q(z)$ be two polynomials with complex coefficients; from the minimal property of $\psi(z)$ one establishes easily that $p(A) = q(A)$ if and only if

$p(z) \equiv q(z) \pmod{\psi(z)}$;  (1)

(1) is equivalent to the set of conditions

$p^{(i)}(\lambda_j) = q^{(i)}(\lambda_j)$,  $i = 0, 1, \dots, \nu_j - 1$;  $j = 1, 2, \dots, m$,  (2)

where $p^{(i)}(\lambda)$ is the $i$th derivative of $p(\lambda)$ (in particular $p^{(0)}(\lambda) = p(\lambda)$).

We shall follow Reference 2 for the definition of a general function of matrices. We suppose that $f(z)$ is a function for which the quantities

$f^{(i)}(\lambda_j)$,  $i = 0, 1, \dots, \nu_j - 1$;  $j = 1, \dots, m$,

are defined (if all the eigenvalues are real, the domain of $f(z)$ may be only a subset of the real numbers). There exists a polynomial $p(z)$ (for example the Lagrange-Sylvester interpolation polynomial (Reference 2)) such that

$p^{(i)}(\lambda_j) = f^{(i)}(\lambda_j)$,  $i = 0, 1, \dots, \nu_j - 1$;  $j = 1, \dots, m$;  (3)

then we set by definition

$f(A) = p(A)$;

(2) shows the coherence of this definition.

For the effective computation of $f(A)$ one usually determines a polynomial $p(z)$ that satisfies (3) only approximately; norms of the matrix $f(A) - p(A)$ give a measure of the accuracy of the approximation.

In this report we have tried to find bounds for the spectral norm of functions of matrices which are suitable for the purpose of approximation, i.e., if the norm of the "error" is "small" the bound must also be "small". The theorem and its corollary are generalizations of Henrici's result for the function $f(z) = z^k$ (Reference 3) and make use of the notion of "departure from normality".

* This work was supported in part by the National Science Foundation under Grant G 16489.

I am indebted to Peter Henrici for useful advice and to Don Frazer, who has improved the language of this text.

** Present address: Institut de Mathématiques Appliquées de l'École Polytechnique de l'Université de Lausanne (Suisse), Avenue de Cour 33.
Notation

Besides the symbols introduced in the first paragraph we shall use the following notation:

$\mu_1, \mu_2, \dots, \mu_n$ are the (not necessarily different) eigenvalues of $A$; of course each of the $\mu_i$'s is equal to one of the $\lambda_j$'s.

If $x$ is a vector of components $x_1, x_2, \dots, x_n$, $\|x\| = \sqrt{\sum_{i=1}^{n} |x_i|^2}$ is the Euclidean norm of $x$.

$\sigma(A) = \max_{\|x\| = 1} \|Ax\|$ is the spectral norm of $A$.

$\|A\| = \sqrt{\sum_{i,j=1}^{n} |a_{ij}|^2}$ is the Euclidean norm of $A$.

$\Delta(A) = \sqrt{\|A\|^2 - \sum_{i=1}^{n} |\mu_i|^2}$ is the departure from normality of $A$;

it is known that $\Delta(A) = 0$ if and only if $A$ is normal.
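As a small illustration of the last definition, the sketch below evaluates the departure from normality in the special case of an upper-triangular matrix, where the eigenvalues can be read off the diagonal. The restriction to triangular input and the function names are assumptions made for this example only; a general matrix would first need its eigenvalues computed.

```python
import math

def frobenius_norm_squared(a):
    """Square of the Euclidean norm: the sum of |a_ij|^2 over all entries."""
    return sum(abs(x) ** 2 for row in a for x in row)

def departure_from_normality_triangular(t):
    """Delta(T) for an upper-triangular T (assumption: eigenvalues = diagonal)."""
    eigenvalue_part = sum(abs(t[i][i]) ** 2 for i in range(len(t)))
    # max(..., 0.0) guards against a tiny negative value caused by rounding.
    return math.sqrt(max(frobenius_norm_squared(t) - eigenvalue_part, 0.0))

T_NORMAL = [[1, 0], [0, 2j]]    # diagonal, hence normal
T_NONNORMAL = [[1, 3], [0, 2]]  # non-zero entry above the diagonal

DELTA_NORMAL = departure_from_normality_triangular(T_NORMAL)
DELTA_NONNORMAL = departure_from_normality_triangular(T_NONNORMAL)
```

For the diagonal matrix the result is 0, in line with the statement that the departure vanishes exactly for normal matrices; for the second matrix it equals the norm of the strictly upper-triangular part, here 3.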

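For any real or complex function $g(z)$, $g(z_1, z_2, \dots, z_i)$ is the $(i-1)$th divided difference of $g(z)$ at the points $z_1, \dots, z_i$. In the case of repeated arguments (confluent case), if $i_1$ arguments are equal to $z_1$, $i_2$ arguments are equal to $z_2$, ..., $i_k$ arguments are equal to $z_k$ ($i_1 + i_2 + \dots + i_k = i$), we define (Reference 4)

$g(z_1, \dots, z_1, \dots, z_k, \dots, z_k) = \frac{1}{(i_1 - 1)!\,(i_2 - 1)! \cdots (i_k - 1)!} \, \frac{\partial^{\,i_1 + i_2 + \dots + i_k - k}}{\partial z_1^{i_1 - 1} \, \partial z_2^{i_2 - 1} \cdots \partial z_k^{i_k - 1}} \, g(z_1, z_2, \dots, z_k)$,

provided that the derivatives involved in this relation exist, i.e.,

$g^{(p)}(z_j)$,  $p = 0, 1, \dots, i_j - 1$;  $j = 1, 2, \dots, k$.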
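The divided differences used throughout can be illustrated with a short sketch. It implements only the usual recursion for pairwise distinct points; the function names and the test function $z^3$ are choices made for this example. The confluent case is merely approximated here by letting two points approach each other.

```python
def divided_difference(f, points):
    """Divided difference f(z_1, ..., z_i) at pairwise distinct points."""
    if len(points) == 1:
        return f(points[0])
    # Recursion: difference of the two lower-order divided differences,
    # divided by the difference of the first and last points.
    left = divided_difference(f, points[:-1])
    right = divided_difference(f, points[1:])
    return (left - right) / (points[0] - points[-1])

def cube(z):
    return z ** 3

# Second divided difference of z^3 at 1, 2, 3.
SECOND_DD = divided_difference(cube, [1, 2, 3])

# Two nearly equal arguments: the value approaches the derivative 3 * 2^2 = 12,
# as the confluent definition g(z_1, z_1) = g'(z_1) suggests.
NEARLY_CONFLUENT = divided_difference(cube, [2.0, 2.0 + 1e-6])
```

For $z^3$ the second divided difference at three points equals the sum of the points, so `SECOND_DD` is 6.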

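Bounds for $\sigma(f(A))$

Suppose that for $f(z)$ the following values exist:

$f^{(i)}(\lambda_j)$,  $i = 0, 1, \dots, n_j - 1$;  $j = 1, \dots, m$;

since $\nu_j \le n_j$, $f(A)$ is defined.

Theorem. Let

$\delta_k = \max_{1 \le i_1 < i_2 < \dots < i_{k+1} \le n} |f(\mu_{i_1}, \mu_{i_2}, \dots, \mu_{i_{k+1}})|$,  $k = 0, 1, \dots, n-1$;

then

$\sigma(f(A)) \le \sum_{k=0}^{n-1} \delta_k \, (\Delta(A))^k$.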

Remarks. 1. If $A$ is normal, $\Delta(A) = 0$ and consequently $\sigma(f(A)) \le \delta_0$; actually it is easy to show that $\sigma(f(A)) = \delta_0$.

2. If the characteristic function of $A$ is equal to its minimal function (i.e. $\nu_j = n_j$, $j = 1, \dots, m$) and if $\sigma(f(A)) = 0$, the bound given by the theorem is also 0; indeed $\sigma(f(A)) = 0$ only if

$f^{(i)}(\lambda_j) = 0$,  $i = 0, 1, \dots, n_j - 1$;  $j = 1, \dots, m$;

but this implies that $\delta_k = 0$, $k = 0, 1, \dots, n-1$. When the characteristic and minimal functions do not coincide, $f(A)$ may vanish and the bound be positive.
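A numerical check of the theorem on a 2 x 2 triangular matrix may help fix ideas. The sketch below is an illustration under stated assumptions, not part of the original argument: the matrix $T = [[1, 3], [0, 2]]$, the function $f(z) = z^2$ and all function names are chosen for this example, and the spectral norm is obtained from the closed-form eigenvalues of $W^*W$ for a 2 x 2 matrix $W$.

```python
import math

def spectral_norm_2x2(w):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of W* W."""
    a, b = complex(w[0][0]), complex(w[0][1])
    c, d = complex(w[1][0]), complex(w[1][1])
    g11 = abs(a) ** 2 + abs(c) ** 2
    g22 = abs(b) ** 2 + abs(d) ** 2
    g12 = a.conjugate() * b + c.conjugate() * d
    trace = g11 + g22
    det = g11 * g22 - abs(g12) ** 2
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    return math.sqrt((trace + disc) / 2.0)

def theorem_bound_2x2(f, mu1, mu2, m12):
    """delta_0 + delta_1 * Delta for T = [[mu1, m12], [0, mu2]], mu1 != mu2."""
    delta0 = max(abs(f(mu1)), abs(f(mu2)))
    delta1 = abs((f(mu1) - f(mu2)) / (mu1 - mu2))
    departure = abs(m12)  # Delta(T) is the norm of the strictly upper part
    return delta0 + delta1 * departure

def square(z):
    return z * z

# T = [[1, 3], [0, 2]] and f(z) = z^2, so f(T) = T^2 = [[1, 9], [0, 4]].
SIGMA = spectral_norm_2x2([[1, 9], [0, 4]])
BOUND = theorem_bound_2x2(square, 1, 2, 3)

# Normal case: T = diag(1, 2), f(T) = diag(1, 4), departure 0.
SIGMA_NORMAL = spectral_norm_2x2([[1, 0], [0, 4]])
BOUND_NORMAL = theorem_bound_2x2(square, 1, 2, 0)
```

Here the bound is $4 + 3 \cdot 3 = 13$, while the spectral norm of $T^2$ is roughly 9.89; in the diagonal case both the norm and the bound equal $\delta_0 = 4$, as in Remark 1.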

Corollary. Suppose that $f(z)$ is an analytic function in the region $D$ of the complex plane defined as the convex hull of the eigenvalues $\lambda_1, \dots, \lambda_m$

($z = \sum_{j=1}^{m} c_j \lambda_j$, $c_j \ge 0$; $\sum_{j=1}^{m} c_j = 1$), and let

$a_k = \max_{z \in D} |f^{(k)}(z)|$.

Then

$\sigma(f(A)) \le \sum_{k=0}^{n-1} \frac{a_k}{k!} \, (\Delta(A))^k$.

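Proof of the Corollary. It suffices to show that

$\delta_k \le \frac{a_k}{k!}$,  $k = 0, 1, \dots, n-1$.  (4)

For $k = 0$, (4) is clear. Suppose $k > 0$ and let $E$ be the region of the convex hull of $\mu_1, \mu_2, \dots, \mu_{k+1}$; we use Hermite's form of divided differences (Reference 4)

$f(\mu_1, \dots, \mu_{k+1}) = \int_0^1 dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{k-1}} f^{(k)}(z) \, dt_k$,  (5)

where $z = (1 - t_1)\mu_1 + (t_1 - t_2)\mu_2 + \dots + (t_{k-1} - t_k)\mu_k + t_k \mu_{k+1}$. The domain of integration is $0 \le t_k \le t_{k-1} \le \dots \le t_1 \le 1$; with the new variables $e_1 = 1 - t_1$, $e_2 = t_1 - t_2$, ..., $e_k = t_{k-1} - t_k$, $e_{k+1} = t_k$, the domain of integration is given by $e_i \ge 0$, $\sum_i e_i = 1$, and consequently "during the integration" the variable $z$ takes exactly all the values of the region $E$; we have

$|f(\mu_1, \dots, \mu_{k+1})| \le \max_{z \in E} |f^{(k)}(z)| \int_0^1 dt_1 \int_0^{t_1} dt_2 \cdots \int_0^{t_{k-1}} dt_k = \frac{1}{k!} \max_{z \in E} |f^{(k)}(z)|$;

but $E \subset D$; consequently

$|f(\mu_1, \dots, \mu_{k+1})| \le \frac{a_k}{k!}$;

since in the relation we can replace $\mu_1, \dots, \mu_{k+1}$ by any choice of $k+1$ of the $\mu$'s, (4) follows.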
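As a hand check of Hermite's form and of the resulting estimate, the case $k = 1$ can be written out for the illustrative choice $f(z) = z^2$ (this choice is an assumption made here for concreteness; it does not come from the text):

```latex
% Case k = 1 of Hermite's form with the illustrative choice f(z) = z^2.
\begin{aligned}
f(\mu_1,\mu_2)
  &= \int_0^1 f'\bigl((1-t_1)\mu_1 + t_1\mu_2\bigr)\,dt_1
   = \int_0^1 2\bigl((1-t_1)\mu_1 + t_1\mu_2\bigr)\,dt_1 \\
  &= \mu_1 + \mu_2
   = \frac{\mu_1^2-\mu_2^2}{\mu_1-\mu_2}, \\
|f(\mu_1,\mu_2)|
  &\le \max_{z\in E}|2z| \;\le\; \max_{z\in D}|f'(z)| = \frac{a_1}{1!}.
\end{aligned}
```

The integral reproduces the ordinary first divided difference, and the bound follows because the segment $E$ joining $\mu_1$ and $\mu_2$ lies in $D$.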

Proof of the Theorem. By the Lagrange-Sylvester method, it is possible to construct a polynomial $g(z)$ such that

$g^{(i)}(\lambda_j) = f^{(i)}(\lambda_j)$,  $i = 0, \dots, n_j - 1$;  $j = 1, \dots, m$;

since $\nu_j \le n_j$, it follows that $g(A) = f(A)$; furthermore the coefficients $\delta_k$ of the theorem are the same for $g(z)$ and $f(z)$, and the bounds for $\sigma(g(A))$ and $\sigma(f(A))$ will be equal. Consequently, we can restrict the proof of the theorem to the case of polynomials, and in the following $f(z)$ will be considered as a polynomial. We shall use some lemmas:

Lemma 1. a) $\sigma(A + B) \le \sigma(A) + \sigma(B)$; $\sigma(AB) \le \sigma(A) \cdot \sigma(B)$.
b) $\sigma(A) \le \|A\|$.
c) If $B = UAU^{-1}$, where $U$ is unitary, then $\sigma(A) = \sigma(B)$ and $\Delta(A) = \Delta(B)$.
d) If $B$ has the non-negative elements $b_{ij}$ such that $|a_{ij}| \le b_{ij}$, then $\sigma(A) \le \sigma(B)$.

Proof. Lemma 1a, 1b, 1c are well-known (Reference 3).

To prove lemma 1d, consider the vector $x$ of components $x_1, x_2, \dots, x_n$ such that $\|x\| = 1$ and $\sigma(A) = \|Ax\|$; we have the inequalities

$\sigma(A) = \|Ax\| \le \|By\| \le \sigma(B)$,

where $y$ has the components $|x_1|, |x_2|, \dots, |x_n|$.

Lemma 2. The $(k-1)$th divided difference $f(z_1, z_2, \dots, z_k)$ of the polynomial $f(z)$ is a continuous function of all its complex arguments $z_1, z_2, \dots, z_k$.

Proof. This is a direct consequence of Hermite's form of the divided difference, which is also valid for repeated arguments (see (5) and Reference 4).

Lemma 3. There exists an upper-triangular matrix $T$ and a unitary matrix $U$ such that

$A = U T U^{*}$;

$T$, which is generally not unique, is a Schur triangular form (see Reference 5).

Notations

In the following, $T$ will denote a Schur triangular form of $A$. Letting $t_{ij}$ be the elements of $T$, we define the diagonal matrix $D$ of elements $d_{ij}$ and the matrix $M$ of elements $m_{ij}$ by the relations:

$d_{ii} = t_{ii}$;

$m_{ij} = 0$ for $i \ge j$;  $m_{ij} = t_{ij}$ for $i < j$;

it follows that $T = D + M$; since the $t_{ii}$'s are the eigenvalues of $A$, we can suppose that $t_{ii} = \mu_i$, $i = 1, 2, \dots, n$.

We introduce the following symbols:

$\bar{M}$ is the matrix of elements $|m_{ij}|$;
$m_{ij}^{(k)}$ are the elements of $M^k$;
$\bar{m}_{ij}^{(k)}$ are the elements of $\bar{M}^k$;

$m[i_1, i_2, \dots, i_k] = m_{i_1 i_2} m_{i_2 i_3} \cdots m_{i_{k-1} i_k}$, so that in particular $m[i, j] = m_{ij}$;

for the divided difference of the polynomial $f(z)$ at the points $z_i, z_j, \dots, z_k$, we write:

$f(z_i, z_j, \dots, z_k) = f(z : i, j, \dots, k)$.

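Lemma 4. For $k > 0$:

$m_{ij}^{(k+1)} = 0$  for  $j - i \le k$;  (6)

$m_{ij}^{(k+1)} = \sum_{i < p_1 < p_2 < \dots < p_k < j} m[i, p_1, \dots, p_k, j]$  for  $j - i > k$.  (7)

Proof. The relations (6) and (7) are readily verified for $k = 1$. Suppose they are already verified for $k \le r$; then

$m_{ij}^{(r+2)} = \sum_{s=1}^{n} m_{is} \, m_{sj}^{(r+1)}$;

(6) follows immediately for $k = r+1$; suppose $j - i > r+1$:

$m_{ij}^{(r+2)} = \sum_{s=i+1}^{j-1} m_{is} \sum_{s < p_1 < \dots < p_r < j} m[s, p_1, \dots, p_r, j]$

$= \sum_{i < s < p_1 < \dots < p_r < j} m[i, s, p_1, \dots, p_r, j]$

$= \sum_{i < p_1 < p_2 < \dots < p_{r+1} < j} m[i, p_1, \dots, p_{r+1}, j]$.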
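The statement of Lemma 4, that the elements of the powers of the strictly upper-triangular matrix $M$ are sums of products along increasing index chains, can be checked on a small example. The sketch below uses 0-based indices, an arbitrary 4 x 4 integer matrix and helper names chosen for this illustration.

```python
from itertools import combinations

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][s] * b[s][j] for s in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(a, k):
    """k-th power of a square matrix, k >= 1."""
    result = a
    for _ in range(k - 1):
        result = mat_mul(result, a)
    return result

def chain_sum(m, i, j, k):
    """Sum of m[i,p_1,...,p_k,j] over i < p_1 < ... < p_k < j (0-based)."""
    total = 0
    for ps in combinations(range(i + 1, j), k):
        chain = (i, *ps, j)
        prod = 1
        for a, b in zip(chain, chain[1:]):
            prod *= m[a][b]
        total += prod
    return total

# Arbitrary strictly upper-triangular test matrix.
M = [[0, 2, -1, 3],
     [0, 0, 4, 5],
     [0, 0, 0, -2],
     [0, 0, 0, 0]]
N = len(M)

# Compare every element of M^(k+1) with the chain sum; when j - i <= k there
# are no admissible chains, so the sum is 0, matching the vanishing elements.
LEMMA4_HOLDS = all(
    mat_pow(M, k + 1)[i][j] == chain_sum(M, i, j, k)
    for k in range(1, N)
    for i in range(N)
    for j in range(N)
)
```

Integer entries keep the comparison exact; for instance the corner element of $M^2$ is $2 \cdot 5 + (-1)(-2) = 12$.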



Lemma 5. The elements $w_{ij}$ of the matrix $W = f(T)$ satisfy the relations

$w_{ij} = 0$  for  $i > j$;  (8)

$w_{ii} = f(\mu_i)$,  $i = 1, 2, \dots, n$;  (9)

$w_{i,i+1} = m[i, i+1] \, f(\mu_i, \mu_{i+1})$,  $i = 1, 2, \dots, n-1$;  (10)

$w_{i,i+2} = m[i, i+2] \, f(\mu_i, \mu_{i+2}) + m[i, i+1, i+2] \, f(\mu_i, \mu_{i+1}, \mu_{i+2})$;

generally for $j > i$

$w_{ij} = \sum_{k=0}^{j-i-1} \; \sum_{i < p_1 < p_2 < \dots < p_k < j} m[i, p_1, p_2, \dots, p_k, j] \, f(\mu : i, p_1, \dots, p_k, j)$.  (11)
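The expansion of $f(T)$ in terms of index chains and divided differences can be tested against a direct computation. The following sketch is an illustration only: it assumes pairwise distinct diagonal elements (so that the ordinary recursion for divided differences applies), uses 0-based indices and exact rational arithmetic, and takes $f(z) = z^3$ with a 3 x 3 matrix chosen for the example.

```python
from fractions import Fraction
from itertools import combinations

def divided_difference(f, points):
    """Divided difference at pairwise distinct points (usual recursion)."""
    if len(points) == 1:
        return f(points[0])
    left = divided_difference(f, points[:-1])
    right = divided_difference(f, points[1:])
    return (left - right) / (points[0] - points[-1])

def f_of_triangular(f, t):
    """f(T) for upper-triangular T with distinct diagonal, via chain sums."""
    n = len(t)
    w = [[Fraction(0)] * n for _ in range(n)]
    for i in range(n):
        w[i][i] = f(Fraction(t[i][i]))
        for j in range(i + 1, n):
            total = Fraction(0)
            for k in range(j - i):  # k intermediate indices, k = 0..j-i-1
                for ps in combinations(range(i + 1, j), k):
                    chain = (i, *ps, j)
                    prod = Fraction(1)
                    for a, b in zip(chain, chain[1:]):
                        prod *= t[a][b]
                    mus = [Fraction(t[c][c]) for c in chain]
                    total += prod * divided_difference(f, mus)
            w[i][j] = total
    return w

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][s] * b[s][j] for s in range(n)) for j in range(n)]
            for i in range(n)]

def cube(z):
    return z ** 3

T = [[1, 2, 3],
     [0, 4, 5],
     [0, 0, 6]]

W_FORMULA = f_of_triangular(cube, T)
W_DIRECT = mat_mul(mat_mul(T, T), T)  # T^3 computed directly
```

Both computations give the same matrix; for example the corner element is $3 \cdot 43 + 2 \cdot 5 \cdot 11 = 239$, where 43 and 11 are the divided differences of $z^3$ at $(1, 6)$ and $(1, 4, 6)$.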


Proof. We first prove lemma 5 when all the eigenvalues $\mu_1, \mu_2, \dots, \mu_n$ are different. The relations (8) and (9) are immediately established for powers and consequently also for polynomials. We prove (11) by recurrence successively for $j - i = 1$, $j - i = 2$, ... using the commutative property

$TW = WT = Q$;  (12)

let $q_{ij}$ be the elements of $Q$. If we write (12) for the elements $q_{i,i+1}$, we find (10) easily. Suppose the general relation (11) is true for $1 \le j - i \le u$; for an element $q_{rs}$ with $s - r = u + 1$, (12) becomes

$t_{rr} w_{rs} + \sum_{l=r+1}^{s-1} t_{rl} w_{ls} + t_{rs} w_{ss} = w_{rr} t_{rs} + \sum_{l=r+1}^{s-1} w_{rl} t_{ls} + w_{rs} t_{ss}$;

but $t_{ii} = \mu_i$, $w_{ii} = f(\mu_i)$, $t_{ij} = m_{ij}$ for $j > i$, and for all $w_{ij}$ occurring in the summations we can apply the hypothesis of recurrence:

$w_{rs}(\mu_r - \mu_s) = m_{rs}\bigl(f(\mu_r) - f(\mu_s)\bigr)$

$+ \sum_{l=r+1}^{s-1} \sum_{k=0}^{l-r-1} \sum_{r < p_1 < \dots < p_k < l} m[r, p_1, \dots, p_k, l] \, m_{ls} \, f(\mu : r, p_1, \dots, p_k, l)$

$- \sum_{l=r+1}^{s-1} \sum_{k=0}^{s-l-1} \sum_{l < p_1 < \dots < p_k < s} m_{rl} \, m[l, p_1, \dots, p_k, s] \, f(\mu : l, p_1, \dots, p_k, s)$;

we change the name of the subscripts $l$ and $p$; since $m[r, p_1, \dots, p_k, l] \, m_{ls} = m[r, p_1, \dots, p_k, l, s]$, we have

$w_{rs}(\mu_r - \mu_s) = m_{rs}\bigl(f(\mu_r) - f(\mu_s)\bigr)$

$+ \sum_{k=1}^{s-r-1} \sum_{r < p_1 < \dots < p_k < s} m[r, p_1, \dots, p_k, s] \, f(\mu : r, p_1, \dots, p_k)$

$- \sum_{k=1}^{s-r-1} \sum_{r < p_1 < \dots < p_k < s} m[r, p_1, \dots, p_k, s] \, f(\mu : p_1, \dots, p_k, s)$;

dividing both sides by $\mu_r - \mu_s$ and using the definition of the divided difference, we get

$w_{rs} = \sum_{k=0}^{s-r-1} \sum_{r < p_1 < \dots < p_k < s} m[r, p_1, \dots, p_k, s] \, f(\mu : r, p_1, \dots, p_k, s)$;

the proof of the lemma is finished for matrices with distinct eigenvalues. Suppose now that $A$ has some multiple eigenvalues and let $T' = D' + M$, where $D'$ is a diagonal matrix of diagonal elements $\xi_1, \xi_2, \dots, \xi_n$, all different. The elements $w'_{ij}$ of the matrix $W' = f(T')$ are given by (8), (9), (10), (11) if we replace $\mu_i$ by $\xi_i$ for $i = 1, 2, \dots, n$. Let $\xi_i \to \mu_i$ for $i = 1, 2, \dots, n$; by continuity the elements $w'_{ij}$ converge to the elements of $f(T)$ and by lemma 2, the limits are given respectively by (8), (9), (10), (11), which is the desired result.

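Now it is easy to prove the theorem; by lemma 4 (applied to $\bar{M}$, which is strictly upper triangular like $M$) and lemma 5, we have

$w_{ij} = 0$  for  $i > j$;

$|w_{ii}| \le \delta_0$  for  $i = 1, 2, \dots, n$;

$|w_{ij}| \le \sum_{k=0}^{j-i-1} \delta_{k+1} \, \bar{m}_{ij}^{(k+1)} = \sum_{k=0}^{n-2} \delta_{k+1} \, \bar{m}_{ij}^{(k+1)}$  for  $j - i > 0$;

by lemma 1a, 1d, if $I$ denotes the unity matrix:

$\sigma(W) \le \sigma\Bigl(\delta_0 I + \sum_{k=1}^{n-1} \delta_k \bar{M}^k\Bigr) \le \sum_{k=0}^{n-1} \delta_k \, (\sigma(\bar{M}))^k$;

by lemma 1b, 1c, we have

$\sigma(\bar{M}) \le \|\bar{M}\| = \Delta(T) = \Delta(A)$;

finally, since $f(A) = f(UTU^{*}) = U f(T) U^{-1}$, we have, by lemma 1c,

$\sigma(f(A)) = \sigma(W) \le \sum_{k=0}^{n-1} \delta_k \, (\Delta(A))^k$,

which is the desired result.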

